Achieving cholera elimination requires adequate, representative data to inform intervention policies. In 2014, IEDCR and icddr,b established a national cholera surveillance system in Bangladesh. However, we do not know whether high-risk cholera areas are captured by this system.
In 1988, the US Centers for Disease Control and Prevention (CDC) published Guidelines for Evaluating Public Health Surveillance Systems (updated in 2001), which aimed to standardize evaluations of public health surveillance systems efficiently and effectively using a series of broad characteristics, including representativeness and sensitivity. Representativeness is defined as the accurate description of cases over time and of their distribution in a population by place and person. Sensitivity is defined as the proportion of true cases or outbreaks detected by the surveillance system. Measuring either indicator, and thereby validating the data collected by the surveillance system, requires external data against which to compare the surveillance data and from which to determine the true incidence of disease in the population. Typically, such data include medical records and registries, which rarely exist or are incomplete in low-resource settings.
Given recent estimates of V. cholerae seroincidence from a nationally representative cross-sectional serosurvey conducted in 2015, we sought to describe the representativeness and sensitivity of the cholera surveillance system using geographically resolved infection data. We assess how well the Bangladesh national cholera surveillance system captures areas at high risk of cholera, in order to determine which surveillance sites may be most efficiently used to deliver new interventions.
Hospitals in the national cholera surveillance system and the icddr,b Dhaka hospital are the only healthcare facilities that regularly perform laboratory confirmation of V. cholerae in Bangladesh. Consequently, it is important to understand how well areas with high cholera risk are captured by the surveillance system.
There are 23 hospital sites that perform laboratory confirmation of V. cholerae in Bangladesh (Table 1).
Table 1: Sentinel hospital IDs and locations.
| ID | Hospital | Division | Type |
|---|---|---|---|
| 1 | District Hospital Norshingdi | Dhaka | district |
| 2 | Adhunik Sadar Hospital Habiganj | Sylhet | district |
| 3 | District Sadar Hospital Cox’s Bazar | Chittagong | district |
| 4 | Adhunik Sadar Hospital Naogaon | Rajshahi | district |
| 5 | General Hospital Patuakhali | Barisal | tertiary |
| 6 | Adhunik Sadar Hospital Thakurgaon | Rangpur | district |
| 7 | District Sadar Hospital Satkhira | Khulna | district |
| 8 | Dhaka Medical College Dhaka | Dhaka | tertiary |
| 9 | Uttara Adhunik Medical College Hospital | Dhaka | tertiary |
| 10 | Bangladesh Institute of Tropical and Infectious Diseases Chittagong | Chittagong | tertiary |
| 11 | General Hospital Tangail | Dhaka | district |
| 12 | General Hospital Narayanganj | Dhaka | district |
| 13 | Sadar Hospital Chuadanga | Khulna | district |
| 14 | General Hospital Meherpur | Khulna | district |
| 15 | General Hospital Comilla | Chittagong | district |
| 16 | Upazila Health Complex Chaugachha Jessore | Khulna | subdistrict |
| 17 | General Hospital Kushtia | Khulna | district |
| 18 | Upazila Health Complex Madan | Mymensingh | subdistrict |
| 19 | Upazila Health Complex Chhatak Sunamganj | Sylhet | subdistrict |
| 20 | Upazila Health Complex Mathbariya | Barisal | subdistrict |
| 21 | Upazila Health Complex Bakerganj | Barisal | subdistrict |
| 22 | Health Complex Shibganj | Rajshahi | subdistrict |
| 23 | icddr,b Cholera Hospital | Dhaka | icddrb |
In the absence of better data on health care utilization at the sentinel surveillance hospitals, we assumed that the catchment areas of subdistrict hospitals, district hospitals, tertiary care hospitals, and the icddr,b Dhaka hospital could be defined by radii of 10, 20, 30, and 30 km around each hospital, respectively (Figure 1). We refer to the joint set of buffers around all 23 hospitals as the “cholera surveillance zone” in Bangladesh.
Figure 1: Map of 10-20-30-30km buffers around subdistrict-district-tertiary-icddr,b sentinel hospital sites, respectively.
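The catchment assumption can be sketched in a few lines of Python. This is only an illustration: the names `BUFFER_KM`, `haversine_km`, and `in_surveillance_zone` are hypothetical, the coordinates are made up rather than taken from the sentinel sites, and great-circle distance is an assumption (the buffers in Figure 1 may have been drawn in a projected coordinate system).

```python
import math

# Assumed catchment radius (km) by hospital type, following the 10-20-30-30 km rule.
BUFFER_KM = {"subdistrict": 10, "district": 20, "tertiary": 30, "icddrb": 30}

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_surveillance_zone(lat, lon, hospitals):
    """True if the point lies inside the buffer of at least one hospital.

    hospitals: iterable of (lat, lon, hospital_type) tuples, where
    hospital_type is a key of BUFFER_KM.
    """
    return any(
        haversine_km(lat, lon, h_lat, h_lon) <= BUFFER_KM[h_type]
        for h_lat, h_lon, h_type in hospitals
    )


# Illustrative coordinates only (not the study's site locations).
hospitals = [(23.70, 90.40, "tertiary"), (24.90, 91.90, "subdistrict")]
print(in_surveillance_zone(23.80, 90.45, hospitals))  # roughly 12 km from the first site
print(in_surveillance_zone(22.00, 89.00, hospitals))  # far from both sites
```

Applying the same test to the centroid of every 5 km x 5 km grid cell would give one possible definition of the cholera surveillance zone on the grid.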
We use modeled estimates of V. cholerae seroincidence, which express the risk of infection within the previous year relative to the population-weighted mean, across a 5 km x 5 km grid of Bangladesh. These estimates were based on a nationally representative serosurvey conducted in 70 communities in Bangladesh in 2015. We measure uncertainty with entropy, which quantifies how uncertain we are that a seroincidence hotspot (RR > 2) is a true hotspot and that a coldspot (RR < 2) is truly not a hotspot (Shannon entropy adapted by https://www.medrxiv.org/content/10.1101/2020.01.10.20016964v1.full.pdf). Our sensitivity analysis shows… [Reason why we chose RR of 2 as the threshold for entropy]
Higher entropy corresponds to greater uncertainty that a grid cell is truly a hotspot or coldspot, while lower entropy corresponds to greater certainty. Areas that have both large uncertainty in our seroincidence estimates and that are outside of the cholera surveillance zone are “greyspots,” locations about which no current cholera-related information is known.
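As a rough illustration of this idea, the sketch below computes the Shannon entropy of a two-class (hotspot versus not hotspot) classification and flags greyspots. It assumes that each grid cell has a probability `p_hotspot` that its RR exceeds 2 (for example, the share of model draws above the threshold) and that entropy is measured in bits; the adaptation in the cited preprint may differ in its details. The function names are hypothetical, and the default threshold of 0.3 is one of the three entropy cutoffs shown in Figure 4.

```python
import math


def binary_entropy(p_hotspot):
    """Shannon entropy (in bits) of a hotspot / not-hotspot classification."""
    if not 0.0 <= p_hotspot <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p_hotspot in (0.0, 1.0):
        return 0.0  # no uncertainty when the classification is certain
    q = 1.0 - p_hotspot
    return -(p_hotspot * math.log2(p_hotspot) + q * math.log2(q))


def is_greyspot(p_hotspot, in_zone, entropy_threshold=0.3):
    """A cell is a greyspot if it is outside the zone AND highly uncertain."""
    return (not in_zone) and binary_entropy(p_hotspot) > entropy_threshold


print(binary_entropy(0.5))                 # maximal uncertainty
print(is_greyspot(0.5, in_zone=False))     # uncertain and unsurveilled
print(is_greyspot(0.5, in_zone=True))      # uncertain but surveilled
print(is_greyspot(0.01, in_zone=False))    # unsurveilled but confidently a coldspot
```

Under this definition a cell is never a greyspot if it lies inside the surveillance zone, no matter how uncertain its seroincidence estimate is.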
[Edit entropy figure to highlight sero sites that are high in seroincidence and high in entropy]
Although we have great uncertainty about the relative seroincidence risk across Bangladesh, our previous modeling efforts represent the best information we have about cholera risk in a given location. We used these modeling outputs to characterize the disease risk of populations living in V. cholerae greyspots.
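Using the modeled seroincidence risk of infection with V. cholerae, we calculate three quantitative measures at the 5 km x 5 km grid cell level to represent risk of infection: the relative risk of infection, the proportion of the population infected within the previous year, and the number of infections within the previous year.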
For each of these measures, we identified thresholds by which we could partition grid cells into “high,” “moderate,” and “low” risk.
We examined the distribution of relative risk and the width of the relative risk confidence interval estimates to determine the range for identifying appropriate cutoffs.
## [1] "***** What is the distribution of relative risk estimates? *****"
## [1] "summary stats on median relative risk: deciles"
## 0% 10% 20% 30% 40% 50% 60% 70%
## 0.1878425 0.5435891 0.6453819 0.7188687 0.7860784 0.8562672 0.9297886 1.0088887
## 80% 90% 100%
## 1.1000150 1.3103767 3.2647244
## [1] "What is the range of the relative risk estimates, where range is the width of the 95% CI?"
## [1] "summary stats on 1/2 range of RR: deciles"
## 0% 10% 20% 30% 40% 50% 60% 70%
## 0.4641809 1.0576451 1.2435192 1.3503377 1.4407787 1.5218541 1.6012463 1.6837145
## 80% 90% 100%
## 1.7904077 1.9615904 2.7042683
## [1] "Decision: Confidence interval ranges are too large to be useful. Arbitrarily choose relative risks of 0.8 and 1.2 as cutoffs"
We found that the confidence intervals were too wide to be useful for this purpose, reflecting very high uncertainty in the relative seroincidence risk. Instead, we arbitrarily chose relative risks of 0.8 and 1.2 as the cutoffs for moderate and high risk.
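A minimal sketch of this binning rule follows. The function name `categorize_relative_risk` is hypothetical, and the handling of values that fall exactly on a cutoff (assigned to the higher category here) is an assumption, since the text does not specify it.

```python
def categorize_relative_risk(rr, moderate_cut=0.8, high_cut=1.2):
    """Bin a grid cell's median relative risk into low / moderate / high.

    Assumption: a value exactly equal to a cutoff goes to the higher category.
    """
    if rr < moderate_cut:
        return "low"
    if rr < high_cut:
        return "moderate"
    return "high"


# The 10th, 50th, and 90th percentiles of median relative risk reported above.
for rr in (0.5435891, 0.8562672, 1.3103767):
    print(rr, categorize_relative_risk(rr))
```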
We examined the distribution of the median proportion of the population infected and chose the 30th and 70th percentiles as cutoffs for moderate and high risk.
## [1] "***** What is the distribution of median proportion (of the population) infected within the last year? *****"
## [1] "summary stats on median proportion infected: deciles"
## 0% 10% 20% 30% 40% 50% 60%
## 0.03595797 0.10215108 0.12106694 0.13596771 0.14978710 0.16324129 0.17732849
## 70% 80% 90% 100%
## 0.19204748 0.21129631 0.25296893 0.63651462
## [1] "Decision: Use 30th and 70th percentiles as cutoffs"
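The percentile-based rule can be sketched as follows. The helper names are hypothetical, and the boundary handling is an assumption. The sketch uses `statistics.quantiles` with `method="inclusive"`, which interpolates linearly between order statistics and should correspond to the default behaviour of `quantile()` in R, although we have not checked it against the original analysis code.

```python
import statistics


def percentile_cutoffs(values, lower_pct=30, upper_pct=70):
    """Return the (lower, upper) percentile cutoffs of a list of cell values."""
    # n=100 yields 99 cut points: index 0 is the 1st percentile, index 98 the 99th.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def categorize_by_cutoffs(value, cutoffs):
    """Bin a value into low / moderate / high given (lower, upper) cutoffs.

    Assumption: a value exactly equal to a cutoff goes to the higher category.
    """
    lower, upper = cutoffs
    if value < lower:
        return "low"
    if value < upper:
        return "moderate"
    return "high"


# Toy data: eleven evenly spaced values.
print(percentile_cutoffs(list(range(1, 12))))  # (4.0, 8.0)

# Cutoffs reported above for the median proportion infected.
reported = (0.13596771, 0.19204748)
print(categorize_by_cutoffs(0.10, reported))
print(categorize_by_cutoffs(0.16, reported))
print(categorize_by_cutoffs(0.25, reported))
```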
We examined the distribution of the median number of absolute infections on the log10 scale and chose the 30th and 70th percentiles as cutoffs for moderate and high risk.
## [1] "***** What is the distribution of median infections within the last year? *****"
## [1] "summary stats on median infections: deciles"
## 0% 10% 20% 30% 40%
## 3.995863 251.563380 926.230252 2040.131620 2700.922748
## 50% 60% 70% 80% 90%
## 3291.225551 3954.858512 4758.848247 5804.953773 7447.174971
## 100%
## 252705.328285
## [1] "Decision: Use 30th and 70th percentiles as cutoffs"
We overlaid the cholera surveillance zone over population data in Bangladesh (original source: 2015 100m WorldPop estimates) to estimate the number of people living within the cholera surveillance zone (Table 2, Figure 3).
Table 2: Population living in the cholera surveillance zone.
| Buffer (Subdistrict-District-Tertiary-icddr,b in km) | Pop | Proportion |
|---|---|---|
| 10-20-30-30 | 50887409 | 0.312939 |
Areas that have both large uncertainty in our seroincidence estimates and that are outside of the cholera surveillance zone are “greyspots,” locations about which no cholera-related information is known (Figure 4). Here we define high uncertainty as an entropy greater than 0.2, 0.3, or 0.4 to illustrate how the distribution of greyspots changes as our definition of uncertainty changes.
Figure 4: Cholera greyspots. The 5 km x 5 km grid cells that are colored are cells where we have reasonably confident seroincidence estimates (A. entropy < 0.2, B. entropy < 0.3, C. entropy < 0.4) or that are in the cholera surveillance zone. Grid cells in grey are greyspots, locations where we have almost no cholera information.
Using data from a nationally representative serosurvey across Bangladesh, we developed maps of infection across 5 km x 5 km grid cells in Bangladesh. Based on the 2015 serosurvey, this map estimates that there were approximately 25,651,745 infections in Bangladesh in the year before data collection, out of an estimated population of approximately 162,611,264. We standardized the seroincidence estimate in each cell by the population-weighted mean infection risk across Bangladesh. This yielded a relative risk of infection for each grid cell (Figure 5).
Figure 5: Median risk of V. cholerae seroincidence relative to a population-weighted mean by 5 km x 5 km grid cell. These relative risk estimates are bounded such that RRs above 2 and below -2 were plotted as the values 2 and -2, respectively. The black marks indicate sentinel hospital locations.
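The standardization step can be illustrated with a short sketch. The function name `relative_risks` and the toy inputs are hypothetical. The sketch assumes that each cell's relative risk is its estimated proportion infected divided by the population-weighted mean proportion across all cells.

```python
def relative_risks(proportion_infected, population):
    """Divide each cell's infection proportion by the population-weighted mean."""
    total_pop = sum(population)
    weighted_mean = sum(p * n for p, n in zip(proportion_infected, population)) / total_pop
    return [p / weighted_mean for p in proportion_infected]


# Toy grid of three cells (values are illustrative, not study estimates).
props = [0.10, 0.20, 0.40]
pops = [1000, 2000, 1000]
rr = relative_risks(props, pops)
print([round(x, 3) for x in rr])

# By construction, the population-weighted mean of the relative risks is 1.
print(round(sum(r * n for r, n in zip(rr, pops)) / sum(pops), 6))
```

A cell with a relative risk above 1 therefore has a higher estimated infection risk than the average person in the country, not merely than the average grid cell.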
We describe the population living in each risk category (Table 3).
Table 3: Population living in each risk category. The population living in high, moderate, and low risk areas according to relative seroincidence risk across Bangladesh.
| Risk Level | Population |
|---|---|
| High | 15198981 |
| Moderate | 63939131 |
| Low | 83473152 |
We redrew the relative seroincidence risk map after binning grid cells into high, moderate, and low risk categories (Figure 6).
Figure 6: Cholera risk map as categorized by the risk of seroincidence relative to a population-weighted mean by 5 km x 5 km grid cell.
We overlaid the cholera surveillance zone with the binned relative seroincidence risk maps to examine the distribution of risks in the surveilled areas (Figure 7 - Figure 8).
Figure 7: Relative seroincidence risk within cholera surveillance zones (10-20-30-30km for subdistrict, district, tertiary care, and icddr,b hospitals).
Figure 8: Relative seroincidence risk outside of cholera surveillance zones (10-20-30-30km for subdistrict, district, tertiary care, and icddr,b hospitals).
We examined the estimated number of infections, the percent infected in the cholera surveillance zone, and the percent of Bangladesh infections captured in the cholera surveillance zone (Table 4).
Table 4: Number and percentage of infections that may be captured in cholera surveillance zones. The percent infected represents the percentage of infected individuals captured within the cholera surveillance zone out of all infected individuals in Bangladesh.
| Buffer size | Surv. zone pop | Number infected | % infected in surv. zone pop | % of all BGD infections |
|---|---|---|---|---|
| 10-20-30-30 | 50887409 | 7047294 | 13.85 | 27.47 |
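The two percentages in this table use different denominators, which the following arithmetic check makes explicit. The variable names are hypothetical, and the national infection total is the figure reported in the text, rounded to a whole number.

```python
zone_pop = 50_887_409             # population inside the surveillance zone
zone_infections = 7_047_294       # estimated infections inside the zone
national_infections = 25_651_745  # national total from the text, rounded

# Share of the zone's own population that was infected.
pct_infected_in_zone = 100 * zone_infections / zone_pop

# Share of all infections in Bangladesh that occurred inside the zone.
pct_of_national_infections = 100 * zone_infections / national_infections

print(round(pct_infected_in_zone, 2))        # 13.85
print(round(pct_of_national_infections, 2))  # 27.47
```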
We then examined the distribution of relative-risk-based categories (Table 5).
Table 5: Number and percentage of infections that may be captured in cholera surveillance zones, categorized by relative risk. The percentage of surveillance zone infections is the percentage of infected people in high/moderate/low risk grid cells among all infections within the cholera surveillance zone. The percentage of surveillance zone population is the percentage of people living in high/moderate/low risk grid cells among all people within the cholera surveillance zone. The distribution across risk categories should sum to 100% for each set of buffer sizes.
| Buffer size | Risk category | Number infected | % surv. zone infections | Surv. zone pop | % surv. zone pop |
|---|---|---|---|---|---|
| 10-20-30-30 | High | 1532923 | 21.75 | 5657869 | 11.12 |
| 10-20-30-30 | Moderate | 1923614 | 27.30 | 10439618 | 20.52 |
| 10-20-30-30 | Low | 3590756 | 50.95 | 34789922 | 68.37 |
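The within-zone distributions can be reproduced from the counts in the table. In this sketch the helper `shares` is a hypothetical name, and each percentage is taken relative to the total within the surveillance zone.

```python
def shares(counts):
    """Percentage of the total contributed by each category, to 2 decimals."""
    total = sum(counts.values())
    return {k: round(100 * v / total, 2) for k, v in counts.items()}


zone_infections = {"High": 1_532_923, "Moderate": 1_923_614, "Low": 3_590_756}
zone_population = {"High": 5_657_869, "Moderate": 10_439_618, "Low": 34_789_922}

print(shares(zone_infections))
print(shares(zone_population))
```

Both sets of percentages sum to 100 (up to rounding) because they are taken within the surveillance zone, not across the whole country.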
We sought to describe how well the cholera surveillance zones capture high-, moderate-, and low-risk populations across Bangladesh.
We summarized the percentage of high, moderate, and low infection risk populations in Bangladesh that would be captured by cholera surveillance zones at different buffer sizes when risk was categorized by relative risk (Table 6).
Table 6: Number and percentage of infections that may be captured in Bangladesh, categorized by relative risk. The captured at-risk population represents the percentage of high/moderate/low risk populations captured by the cholera surveillance zone out of all high/moderate/low risk populations in Bangladesh. The captured infections represents the percentage of infections in high/moderate/low risk grid cells among all infections in high/moderate/low risk grid cells across Bangladesh.
| Buffer size | Risk category | Surv. zone pop | Surv. zone infections | Captured At-Risk Pop (%) | Captured Infections (%) |
|---|---|---|---|---|---|
| 10-20-30-30 | High | 5657869 | 1532923 | 37.23 | 33.52 |
| 10-20-30-30 | Moderate | 10439618 | 1923614 | 16.33 | 16.63 |
| 10-20-30-30 | Low | 34789922 | 3590756 | 41.68 | 37.74 |
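Unlike the within-zone distribution, the captured at-risk population uses the national population of each risk category as its denominator. The sketch below recomputes that column from the national counts in Table 3 and the surveillance zone counts in Table 6. The variable names are hypothetical.

```python
# National population by relative-risk category (Table 3).
national_pop = {"High": 15_198_981, "Moderate": 63_939_131, "Low": 83_473_152}

# Population inside the surveillance zone by category (Table 6).
zone_pop = {"High": 5_657_869, "Moderate": 10_439_618, "Low": 34_789_922}

# Percentage of each category's national population living inside the zone.
captured = {k: round(100 * zone_pop[k] / national_pop[k], 2) for k in national_pop}
print(captured)
```

These percentages do not sum to 100, because each one answers a separate question: what share of all people living in, say, high-risk cells falls within reach of a sentinel hospital.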
The second measure of risk we examine to evaluate the surveillance system is the estimated proportion of the grid cell population that was infected with V. cholerae in the year prior to data collection. This yielded the median estimated proportion of infections for each grid cell (Figure 9).
Figure 9: Median estimated proportion of grid cell population infected with V. cholerae in the previous year. The black marks indicate sentinel hospital locations.
We describe the population living in each risk category defined by the 30th and 70th percentiles of the estimated proportion of infections in each grid cell (Table 7).
Table 7: Population living in each risk category. The population living in high, moderate, and low risk areas according to the estimated proportion of infections across Bangladesh.
| Risk Level | Population |
|---|---|
| High | 33921270 |
| Moderate | 64812197 |
| Low | 63877797 |
We redrew the estimated proportion of infections risk map after binning grid cells into high, moderate, and low risk categories (Figure 10).
Figure 10: Cholera risk map as categorized by the estimated proportion of V. cholerae infections by 5 km x 5 km grid cell.
We overlaid the cholera surveillance zone with the binned infection proportion risk maps to examine the distribution of risks in the surveilled areas (Figure 11 - Figure 12).
Figure 11: Infection proportion risk categories within cholera surveillance zones (10-20-30-30km for subdistrict, district, tertiary care, and icddr,b hospitals).
Figure 12: Infection proportion risk categories outside of cholera surveillance zones (10-20-30-30km for subdistrict, district, tertiary care, and icddr,b hospitals).
We examined the estimated number of infections, the percent infected in the cholera surveillance zone, and the percent of Bangladesh infections captured in the cholera surveillance zone (Table 8).
Table 8: Number and percentage of infections that may be captured in cholera surveillance zones. The percent infected represents the percentage of infected individuals captured within the cholera surveillance zone out of all infected individuals in Bangladesh.
| Buffer size | Surv. zone pop | Number infected | % infected in surv. zone pop | % of all BGD infections |
|---|---|---|---|---|
| 10-20-30-30 | 50887409 | 7047294 | 13.85 | 27.47 |
We then examined the distribution of categories (Table 9) based on the proportion of infections.
Table 9: Number and percentage of infections that may be captured in cholera surveillance zones, categorized by the proportion infected. The percentage of surveillance zone infections is the percentage of infected people in high/moderate/low risk grid cells among all infections within the cholera surveillance zone. The percentage of surveillance zone population is the percentage of people living in high/moderate/low risk grid cells among all people within the cholera surveillance zone. The distribution across risk categories should sum to 100% for each set of buffer sizes.
| Buffer size | Risk category | Number infected | % surv. zone infections | Surv. zone pop | % surv. zone pop |
|---|---|---|---|---|---|
| 10-20-30-30 | High | 2329476 | 33.05 | 9465033 | 18.60 |
| 10-20-30-30 | Moderate | 1754864 | 24.90 | 11017629 | 21.65 |
| 10-20-30-30 | Low | 2962954 | 42.04 | 30404747 | 59.75 |
We sought to describe how well the cholera surveillance zones capture high-, moderate-, and low-risk populations across Bangladesh.
We summarized the percentage of high, moderate, and low infection risk populations in Bangladesh that would be captured by cholera surveillance zones at different buffer sizes when risk was categorized by the infection proportion (Table 10).
Table 10: Number and percentage of infections that may be captured in Bangladesh, categorized by the proportion of individuals infected with V. cholerae. The captured at-risk population represents the percentage of high/moderate/low risk populations captured by the cholera surveillance zone out of all high/moderate/low risk populations in Bangladesh. The captured infections represents the percentage of infections in high/moderate/low risk grid cells among all infections in high/moderate/low risk grid cells across Bangladesh.
| Buffer size | Risk category | Surv. zone pop | Surv. zone infections | Captured At-Risk Pop (%) | Captured Infections (%) |
|---|---|---|---|---|---|
| 10-20-30-30 | High | 9465033 | 2329476 | 27.90 | 27.65 |
| 10-20-30-30 | Moderate | 11017629 | 1754864 | 17.00 | 16.65 |
| 10-20-30-30 | Low | 30404747 | 2962954 | 47.60 | 44.29 |
The third measure of risk we examine to evaluate the surveillance system is the median estimated number of V. cholerae infections in each grid cell (Figure 13).
Figure 13: Median number of estimated V. cholerae infections per grid cell in the previous year. The black marks indicate sentinel hospital locations.
We describe the population living in each risk category defined by the 30th and 70th percentiles of the number of infections in each grid cell (Table 11).
Table 11: Population living in each risk category. The population living in high, moderate, and low risk areas according to the estimated number of infections across Bangladesh.
| Risk Level | Population |
|---|---|
| High | 96887104 |
| Moderate | 56357517 |
| Low | 9366643 |
We redrew the estimated number of infections risk map after binning grid cells into high, moderate, and low risk categories (Figure 14).
Figure 14: Cholera risk map as categorized by the estimated number of V. cholerae infections by 5 km x 5 km grid cell.
We overlaid the cholera surveillance zone with the binned number of infections risk map to examine the distribution of risk in the surveilled areas (Figure 15 - Figure 16).
Figure 15: Number of infections risk categories within cholera surveillance zones (10-20-30-30km for subdistrict, district, tertiary care, and icddr,b hospitals).
Figure 16: Number of infections risk categories outside of cholera surveillance zones (10-20-30-30km for subdistrict, district, tertiary care, and icddr,b hospitals).
We examined the estimated number of infections, the percent infected in the cholera surveillance zone, and the percent of Bangladesh infections captured in the cholera surveillance zone (Table 12).
Table 12: Number and percentage of infections that may be captured in cholera surveillance zones. The percent infected represents the percentage of infected individuals captured within the cholera surveillance zone out of all infected individuals in Bangladesh.
| Buffer size | Surv. zone pop | Number infected | % infected in surv. zone pop | % of all BGD infections |
|---|---|---|---|---|
| 10-20-30-30 | 50887409 | 7047294 | 13.85 | 27.47 |
We then examined the distribution of risk categories (Table 13) based on the number of infections.
Table 13: Number and percentage of infections that may be captured in cholera surveillance zones, categorized by the number of infections. The percentage of surveillance zone infections is the percentage of infected people in high/moderate/low risk grid cells among all infections within the cholera surveillance zone. The percentage of surveillance zone population is the percentage of people living in high/moderate/low risk grid cells among all people within the cholera surveillance zone. The distribution across risk categories should sum to 100% for each set of buffer sizes.
| Buffer size | Risk category | Number infected | % surv. zone infections | Surv. zone pop | % surv. zone pop |
|---|---|---|---|---|---|
| 10-20-30-30 | High | 5259541 | 74.63 | 37576681 | 73.84 |
| 10-20-30-30 | Moderate | 1601849 | 22.73 | 11855989 | 23.30 |
| 10-20-30-30 | Low | 185903 | 2.64 | 1454739 | 2.86 |
We sought to describe how well the cholera surveillance zones capture high-, moderate-, and low-risk populations across Bangladesh.
We summarized the percentage of high, moderate, and low infection risk populations in Bangladesh that would be captured by cholera surveillance zones at different buffer sizes when risk was categorized by the number of V. cholerae infections (Table 14).
Table 14: Number and percentage of infections that may be captured in Bangladesh, categorized by the number of individuals infected with V. cholerae. The captured at-risk population represents the percentage of high/moderate/low risk populations captured by the cholera surveillance zone out of all high/moderate/low risk populations in Bangladesh. The captured infections represents the percentage of infections in high/moderate/low risk grid cells among all infections in high/moderate/low risk grid cells across Bangladesh.
| Buffer size | Risk category | Surv. zone pop | Surv. zone infections | Captured At-Risk Pop (%) | Captured Infections (%) |
|---|---|---|---|---|---|
| 10-20-30-30 | High | 37576681 | 5259541 | 38.78 | 32.82 |
| 10-20-30-30 | Moderate | 11855989 | 1601849 | 21.04 | 19.33 |
| 10-20-30-30 | Low | 1454739 | 185903 | 15.53 | 13.89 |